Efficient Hardware Architecture for Correlation-Based Spike Detection and Unsupervised Clustering
This chapter presents a novel hardware architecture for correlation-based spike detection and unsupervised clustering. The architecture utilizes the information extracted from the results of spike clustering for efficient spike detection, and supports fast computation of the normalized correlation and OSORT operations. The normalized correlation is used for template matching for accurate spike detection, while the OSORT algorithm is adopted for unsupervised classification of the detected spikes. The mean of the spikes in each cluster produced by the OSORT algorithm is used as the template for subsequent detection. The architecture adopts a post-normalization technique to reduce area costs, and modified OSORT operations are proposed to facilitate unsupervised clustering in hardware. The proposed architecture is implemented on a field programmable gate array (FPGA) for performance evaluation. In addition to attaining high detection and classification accuracy for spike sorting, experimental results reveal that the proposed architecture is an efficient design providing low area cost and high throughput for real-time and offline spike sorting applications.
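The detection step described above can be sketched in software. Below is a minimal NumPy illustration of normalized-correlation template matching; the hardware computes the same quantity with post-normalization to save area, and the function names and detection threshold here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def normalized_correlation(window, template):
    # Normalized cross-correlation between a signal window and a spike template.
    num = np.dot(window, template)
    denom = np.linalg.norm(window) * np.linalg.norm(template)
    return num / denom if denom > 0 else 0.0

def detect_spikes(signal, template, threshold=0.95):
    # Slide the template across the signal and report every window whose
    # normalized correlation with the template exceeds the threshold.
    n = len(template)
    hits = []
    for i in range(len(signal) - n + 1):
        if normalized_correlation(signal[i:i + n], template) >= threshold:
            hits.append(i)
    return hits
```

In the architecture, the templates are the cluster means produced by OSORT, so detection improves as clustering refines them.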
Robust Feature Learning Against Noisy Labels
Supervised learning of deep neural networks relies heavily on large-scale datasets annotated with high-quality labels. Mislabeled samples, however, can significantly degrade the generalization of models and cause them to memorize samples, learning erroneous associations between data contents and incorrect annotations. To address this, this paper proposes an efficient approach to tackling noisy labels by learning robust feature representations based on unsupervised augmentation restoration and cluster regularization. In addition, progressive self-bootstrapping is introduced to minimize the negative impact of supervision from noisy labels. Our proposed design is generic and flexible, applying to existing classification architectures with minimal overhead. Experimental results show that our proposed method can efficiently and effectively enhance model robustness under severely noisy labels.
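As a rough software analogue of the self-bootstrapping idea, the classic soft-bootstrapping loss blends the given (possibly noisy) labels with the model's own predictions before computing cross-entropy; the paper's progressive scheme is more elaborate, and the blending weight `beta` below is an illustrative assumption.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def bootstrapped_cross_entropy(logits, labels_onehot, beta=0.8):
    # Soft bootstrapping: the training target is a convex combination of the
    # provided (possibly noisy) one-hot labels and the model's prediction,
    # so confidently mislabeled samples contribute less gradient.
    p = softmax(logits)
    target = beta * labels_onehot + (1.0 - beta) * p
    return -np.mean(np.sum(target * np.log(p + 1e-12), axis=1))
```

Lowering `beta` shifts trust from the annotation toward the model, which is the lever a progressive schedule would adjust over training.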
Efficient Fuzzy C-Means Architecture for Image Segmentation
This paper presents a novel VLSI architecture for image segmentation. The architecture is based on the fuzzy c-means algorithm with a spatial constraint for reducing the misclassification rate. In the architecture, the usual iterative operations for updating the membership matrix and cluster centroids are merged into a single updating process to avoid the large storage requirement. In addition, an efficient pipelined circuit is used for the updating process to accelerate the computation. Experimental results show that the proposed circuit is an effective alternative for real-time image segmentation with low area cost and a low misclassification rate.
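The merged updating process can be sketched as a single pass over the data that computes each sample's memberships on the fly and accumulates the centroid numerators and denominators, so the full membership matrix is never stored. The NumPy sketch below omits the spatial constraint and the pipelining; names and defaults are illustrative.

```python
import numpy as np

def fcm_iteration(data, centroids, m=2.0, eps=1e-9):
    # One merged fuzzy c-means update: per sample, compute memberships
    # on the fly and accumulate weighted sums for the new centroids,
    # avoiding storage of the full membership matrix.
    c, d = centroids.shape
    num = np.zeros((c, d))
    den = np.zeros(c)
    for x in data:
        dist = np.linalg.norm(x - centroids, axis=1) + eps
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum()          # fuzzy memberships of this sample
        w = u ** m
        num += w[:, None] * x
        den += w
    return num / den[:, None]
```

Iterating this update until the centroids stop moving reproduces standard FCM while keeping memory proportional to the number of clusters, not the number of pixels.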
Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy
This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipelined fashion to maximize the throughput of the computation. Moreover, the number of hardware multipliers and dividers is minimized to reduce the hardware costs. The proposed architecture is used as custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system.
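A minimum-squared-error FFT-based unwrapping of this kind can be sketched in software as a DCT solution of the associated discrete Poisson equation (Ghiglia-Romero style least-squares unwrapping). The sketch below is NumPy-only, uses an explicit DCT matrix for clarity, and illustrates the algorithm family rather than the paper's hardware realization.

```python
import numpy as np

def wrap(p):
    # Wrap phase values into (-pi, pi].
    return (p + np.pi) % (2 * np.pi) - np.pi

def dct_matrix(n):
    # Orthonormal DCT-II matrix (explicit, to keep the sketch NumPy-only).
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    C = np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def unwrap_lsq(psi):
    # Least-squares phase unwrapping: build the divergence of the wrapped
    # phase gradients, then solve the discrete Neumann Poisson equation
    # by diagonalizing the Laplacian with the DCT.
    M, N = psi.shape
    dx = wrap(np.diff(psi, axis=0))
    dy = wrap(np.diff(psi, axis=1))
    rho = np.zeros((M, N))
    rho[:-1] += dx
    rho[1:] -= dx
    rho[:, :-1] += dy
    rho[:, 1:] -= dy
    Cm, Cn = dct_matrix(M), dct_matrix(N)
    rhat = Cm @ rho @ Cn.T
    i = np.arange(M)[:, None]
    j = np.arange(N)[None, :]
    denom = 2 * np.cos(np.pi * i / M) + 2 * np.cos(np.pi * j / N) - 4
    denom[0, 0] = 1.0            # the constant (DC) mode is arbitrary
    phat = rhat / denom
    phat[0, 0] = 0.0
    return Cm.T @ phat @ Cn
```

The solution is defined up to an additive constant, which is why comparisons are made after removing the mean; in hardware, the transform stage dominates the cost, motivating the paper's pipelining and multiplier minimization.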
Recent Progress in Parallel and Distributed Computing
Parallel and distributed computing has been one of the most active areas of research in recent years. The techniques involved have found significant applications in areas as diverse as engineering, management, natural sciences, and social sciences. This book reports state-of-the-art topics and advances in this emerging field. Completely up-to-date, aspects it examines include the following: 1) Social networks; 2) Smart grids; 3) Graphic processing unit computation; 4) Distributed software development tools; 5) Analytic hierarchy process and the analytic network process.
Efficient Architecture for Spike Sorting in Reconfigurable Hardware
This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting, and its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors shares the same circuit to lower the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroids are merged into a single updating process to avoid the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented on a field programmable gate array (FPGA) and embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design attaining a high classification correct rate and high speed computation.
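The GHA feature-extraction step is Sanger's rule, whose weight update can be sketched in a few lines of NumPy. The learning rate and array shapes below are illustrative, and the hardware shares one circuit across weight vectors rather than looping over samples as this sketch does.

```python
import numpy as np

def gha_update(W, x, lr=0.01):
    # One generalized Hebbian algorithm (Sanger's rule) step:
    #   dW = lr * (y x^T - LT(y y^T) W)
    # where y = W x and LT keeps the lower triangle, so row k is
    # deflated against rows 0..k and converges to the k-th principal
    # component of the input distribution.
    y = W @ x
    W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

Streaming spike waveforms through this update yields the leading principal components online, which is what makes GHA attractive for a fixed-area hardware pipeline feeding the FCM clustering stage.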
Some results on source coding, vector quantization and progressive image transmission
In this dissertation, we investigate three problems in data compression, namely, source coding, vector quantization, and progressive image transmission. In source coding, we find the optimal performance of a data compression system for cyclostationary Gaussian sources. This is accomplished by first defining a rate distortion function, R(D), for cyclostationary sources, and then proving the source coding theorem and its converse for these sources. These theorems state the following facts: given a distortion D, there exists a code with rate arbitrarily close to R(D), and there exists no code with rate less than R(D). From these two theorems, we conclude that the rate distortion function defined in this dissertation is the optimal performance for cyclostationary Gaussian sources. In vector quantization, we develop a new tree-structured vector quantizer (VQ) design algorithm which operates under rate and storage constraints. This new algorithm has the following features: first, it has low computational complexity; second, it can control storage complexity; third, its performance is close to that of the best known VQ. The design of this VQ uses a tree-growing approach, growing the tree one stage (layer) at a time. Before the design, the storage and rate constraints at each stage are specified. Then, we minimize the distortion at each stage under these constraints. The minimization of the distortion involves the optimal rate and storage allocation, which is implemented using dynamic programming. As the third problem in data compression, we develop a new progressive image transmission (PIT) design algorithm in which the resolution and resource constraints at each stage can be specified and controlled. The type of resource can be the rate or distortion, and the storage size. The first step of the design of this algorithm is to use the wavelet transform to obtain a pyramid structure representation of an image. Then, in the design of each stage, we optimally allocate the resources available at the current stage to all the subimages which constitute the image to be reconstructed. The resources allocated to the subimages are then used to design the first or subsequent layers of the TSVQs to successively refine these subimages. Since the existing PIT algorithms can only control the rate or the resolution (but not both) at each stage, this algorithm is a generalization of the existing algorithms and can be used in a wider range of applications.
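The codebook refinement underlying such TSVQ designs is the generalized Lloyd iteration, sketched below in NumPy. The dissertation's contribution is the rate- and storage-constrained tree growing layered on top of this via dynamic programming, which this sketch does not attempt; the function name and initial codebook are illustrative.

```python
import numpy as np

def lloyd_vq(data, codebook, iters=10):
    # Generalized Lloyd iterations: assign each training vector to its
    # nearest codeword (nearest-neighbor condition), then replace each
    # codeword by the centroid of its cell (centroid condition).
    for _ in range(iters):
        d = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for k in range(len(codebook)):
            cell = data[assign == k]
            if len(cell):
                codebook[k] = cell.mean(axis=0)
    return codebook
```

In a tree-structured VQ, each node of the tree holds such a codebook trained on the vectors routed to it, so successive layers refine the quantization progressively.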